AITopics | event cluster

event cluster

EventHunter: Dynamic Clustering and Ranking of Security Events from Hacker Forum Discussions

Ech-Chammakhy, Yasir, Motii, Anas, Rabii, Anass, Chbili, Jaafar

arXiv.org Artificial Intelligence, Jul-15-2025

Hacker forums provide critical early warning signals for emerging cybersecurity threats, but extracting actionable intelligence from their unstructured and noisy content remains a significant challenge. This paper presents an unsupervised framework that automatically detects, clusters, and prioritizes security events discussed across hacker forum posts. Our approach leverages Transformer-based embeddings fine-tuned with contrastive learning to group related discussions into distinct security event clusters, identifying incidents like zero-day disclosures or malware releases without relying on predefined keywords. The framework incorporates a daily ranking mechanism that prioritizes identified events using quantifiable metrics reflecting timeliness, source credibility, information completeness, and relevance. Experimental evaluation on real-world hacker forum data demonstrates that our method effectively reduces noise and surfaces high-priority threats, enabling security analysts to mount proactive responses. By transforming disparate hacker forum discussions into structured, actionable intelligence, our work addresses fundamental challenges in automated threat detection and analysis.
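The daily ranking step described above can be illustrated with a minimal sketch. The abstract names the four metrics but not how they are combined, so the weighted sum, the equal weights, and the names `SecurityEvent`, `priority`, and `rank_events` are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class SecurityEvent:
    """One event cluster with metric scores in [0, 1] (hypothetical schema)."""
    name: str
    timeliness: float
    credibility: float
    completeness: float
    relevance: float


# Assumed equal weights; the paper's actual weighting is not given in the abstract.
DEFAULT_WEIGHTS = {
    "timeliness": 0.25,
    "credibility": 0.25,
    "completeness": 0.25,
    "relevance": 0.25,
}


def priority(event, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the four metric scores."""
    return sum(w * getattr(event, metric) for metric, w in weights.items())


def rank_events(events, weights=DEFAULT_WEIGHTS):
    """Return events sorted from highest to lowest priority."""
    return sorted(events, key=lambda e: priority(e, weights), reverse=True)
```

Changing the weights shifts the ranking toward whichever metric an analyst cares about most, for example favoring timeliness during an active incident.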

data mining, large language model, machine learning, (25 more...)

2507.09762

Country:

Europe (1.00)
North America > Mexico (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.70)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(5 more...)

The 2021 Tokyo Olympics Multilingual News Article Dataset

Novak, Erik, Calcina, Erik, Mladenić, Dunja, Grobelnik, Marko

arXiv.org Artificial Intelligence, Feb-13-2025

In this paper, we introduce a dataset of multilingual news articles covering the 2021 Tokyo Olympics. A total of 10,940 news articles were gathered from 1,918 different publishers, covering 1,350 sub-events of the 2021 Olympics, and published between July 1, 2021, and August 14, 2021. These articles are written in nine languages from different language families and in different scripts. To create the dataset, the raw news articles were first retrieved via a service that collects and analyzes news articles. Then, the articles were grouped using an online clustering algorithm, with each group containing articles reporting on the same sub-event. Finally, the groups were manually annotated and evaluated. The development of this dataset aims to provide a resource for evaluating the performance of multilingual news clustering algorithms, for which limited datasets are available. It can also be used to analyze the dynamics and events of the 2021 Tokyo Olympics from different perspectives. The dataset is available in CSV format and can be accessed from the CLARIN.SI repository.
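The abstract does not say which online clustering algorithm grouped the articles. As a rough illustration of the general idea, the sketch below uses single-pass threshold clustering over article vectors: each new vector joins the most similar centroid if the cosine similarity clears a threshold, otherwise it starts a new group. The class name `OnlineClusterer` and the threshold value are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class OnlineClusterer:
    """Single-pass clustering: assign each vector as it arrives."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value, tune per dataset
        self.centroids = []         # running mean vector per cluster
        self.counts = []            # number of members per cluster

    def add(self, vec):
        """Assign vec to a cluster and return that cluster's index."""
        best, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            n = self.counts[best]
            # Update the running mean of the matched cluster.
            self.centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(self.centroids[best], vec)
            ]
            self.counts[best] = n + 1
            return best
        self.centroids.append(list(vec))
        self.counts.append(1)
        return len(self.centroids) - 1
```

In practice the vectors would come from a multilingual text encoder so that articles in different languages about the same sub-event land close together.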

data mining, machine learning, natural language, (19 more...)

2502.06648

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.83)
North America > United States (0.05)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Olympic Games (1.00)

Technology:

Information Technology > Communications (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory

Park, Junyeong, Cho, Junmo, Ahn, Sungjin

arXiv.org Artificial Intelligence, Dec-25-2024

Significant advances have been made in developing general-purpose embodied AI in environments like Minecraft through the adoption of LLM-augmented hierarchical approaches. While these approaches, which combine high-level planners with low-level controllers, show promise, low-level controllers frequently become performance bottlenecks due to repeated failures. In this paper, we argue that the primary cause of failure in many low-level controllers is the absence of an episodic memory system. To address this, we introduce MrSteve (Memory Recall Steve-1), a novel low-level controller equipped with Place Event Memory (PEM), a form of episodic memory that captures what, where, and when information from episodes. This directly addresses the main limitation of the popular low-level controller, Steve-1. Unlike previous models that rely on short-term memory, PEM organizes spatial and event-based data, enabling efficient recall and navigation in long-horizon tasks. Additionally, we propose an Exploration Strategy and a Memory-Augmented Task Solving Framework, allowing agents to alternate between exploration and task-solving based on recalled events. Our approach significantly improves task-solving and exploration efficiency compared to existing methods. We will release our code and demos on the project page: https://sites.google.com/view/mr-steve.
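To make the what-where-when idea concrete, here is a toy data structure that stores episodes by place and recalls the most recent matching event. It is only a sketch: the abstract does not describe how PEM is implemented, and the grid bucketing, exact string matching, and the names `Episode` and `ToyPlaceEventMemory` are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    what: str      # event description, e.g. "saw cow"
    where: tuple   # (x, z) position in the world
    when: int      # timestep at which the event was observed


class ToyPlaceEventMemory:
    """Groups episodes by grid cell (place), then searches events within places."""

    def __init__(self, cell_size=16.0):
        self.cell_size = cell_size  # assumed place granularity
        self.places = {}            # grid cell -> list of episodes

    def _cell(self, pos):
        return (math.floor(pos[0] / self.cell_size),
                math.floor(pos[1] / self.cell_size))

    def write(self, episode):
        self.places.setdefault(self._cell(episode.where), []).append(episode)

    def recall(self, what):
        """Return the most recent episode matching `what`, or None."""
        hits = [e for eps in self.places.values() for e in eps if e.what == what]
        return max(hits, key=lambda e: e.when) if hits else None
```

An agent could call `recall` before exploring: a hit gives a location to navigate to, while `None` signals that exploration is needed.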

large language model, machine learning, natural language, (18 more...)

2411.06736

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Sweden > Skåne County > Malmö (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (0.61)
Health & Medicine (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Games > Computer Games (0.61)

A Novel End-To-End Event Geolocation Method Leveraging Hyperbolic Space and Toponym Hierarchies

Qiao, Yaqiong, Huang, Guojun

arXiv.org Artificial Intelligence, Dec-14-2024

Abstract: Timely detection and geolocation of events based on social data can provide critical information for applications such as crisis response and resource allocation. However, most existing methods are greatly affected by event detection errors, leading to insufficient geolocation accuracy. To this end, this paper proposes a novel end-to-end event geolocation method (GTOP) leveraging hyperbolic space and toponym hierarchies. Specifically, the proposed method contains one event detection module and one geolocation module. The event detection module constructs a heterogeneous information network based on social data, then constructs a homogeneous message graph and combines it with the text and time features of the messages to learn initial node features. Node features are updated in hyperbolic space and then fed into a classifier for event detection. To reduce the geolocation error, this paper proposes a noise toponym filtering algorithm (HIST) based on the hierarchical structure of toponyms. HIST analyzes the hierarchical structure of toponyms mentioned in the event cluster, taking the most frequent city-level locations as the coarse-grained locations for events. To further improve the geolocation accuracy, we propose a fine-grained pseudo-toponym generation algorithm (FIT) based on the output of HIST, and combine the generated pseudo toponyms with the filtered toponyms to locate events at the geographic center point of the combined toponyms. Extensive experiments are conducted on the Chinese dataset constructed in this paper and another public English dataset. The experimental results show that the proposed method is superior to the state-of-the-art baselines.
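The filtering and center-point steps can be sketched in a few lines. This is a simplified illustration, not HIST or FIT as published: it keeps only toponyms under the single most frequent city and averages their coordinates. The input schema (`city`, `lat`, `lon` keys) and the function name `locate_event` are assumptions, and the plain arithmetic mean is only reasonable for points that are close together.

```python
from collections import Counter


def locate_event(mentions):
    """Estimate an event location from toponym mentions in one event cluster.

    Each mention is a dict with hypothetical keys: "city", "lat", "lon".
    Returns (city, (lat, lon)) or None if there are no mentions.
    """
    if not mentions:
        return None
    # Coarse-grained step: pick the most frequently mentioned city.
    city_counts = Counter(m["city"] for m in mentions)
    top_city, _ = city_counts.most_common(1)[0]
    # Filter out toponyms that fall under other cities (treated as noise).
    kept = [m for m in mentions if m["city"] == top_city]
    # Fine-grained step: geographic center of the remaining toponyms.
    lat = sum(m["lat"] for m in kept) / len(kept)
    lon = sum(m["lon"] for m in kept) / len(kept)
    return top_city, (lat, lon)
```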

data mining, machine learning, toponym, (19 more...)

2412.1087

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
South America > Brazil > Ceará > Fortaleza (0.04)
(15 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (0.68)
Law (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(3 more...)

Large Language Model Enhanced Clustering for News Event Detection

Tarekegn, Adane Nega

arXiv.org Artificial Intelligence, Jul-6-2024

The news landscape is continuously evolving, with an ever-increasing volume of information from around the world. Automated event detection within this vast data repository is essential for monitoring, identifying, and categorizing significant news occurrences across diverse platforms. This paper presents an event detection framework that leverages Large Language Models (LLMs) combined with clustering analysis to detect news events from the Global Database of Events, Language, and Tone (GDELT). The framework enhances event clustering through both pre-event detection tasks (keyword extraction and text embedding) and post-event detection tasks (event summarization and topic labelling). We also evaluate the impact of various textual embeddings on the quality of clustering outcomes, ensuring robust news categorization. Additionally, we introduce a novel Cluster Stability Assessment Index (CSAI) to assess the validity and robustness of clustering results. CSAI utilizes multiple feature vectors to provide a new way of measuring clustering quality. Our experiments indicate that the use of LLM embedding in the event detection framework has significantly improved the results, demonstrating greater robustness in terms of CSAI scores. Moreover, post-event detection tasks generate meaningful insights, facilitating effective interpretation of event clustering results. Overall, our experimental results indicate that the proposed framework offers valuable insights and could enhance the accuracy in news analysis and reporting.

algorithm, csai score, llm, (13 more...)

2406.10552

Country:

North America > United States > Colorado (0.04)
Europe > Norway > Western Norway > Vestland > Bergen (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

A method for incremental discovery of financial event types based on anomaly detection

Gu, Dianyue, Li, Zixu, Guan, Zhenhai, Zhang, Rui, Huang, Lan

arXiv.org Artificial Intelligence, Feb-16-2023

Event datasets in the financial domain are often constructed for specific application scenarios, so their event types are only weakly reusable; at the same time, the massive and diverse stream of new financial data cannot be limited to the event types defined for those scenarios. A small set of event types does not meet the research needs of more complex tasks such as predicting major financial events and analyzing their ripple effects. In this paper, a three-stage approach is proposed for the incremental discovery of event types. Starting from an existing annotated financial event dataset and a set of financial event data that mixes original and unknown event types, a semi-supervised deep clustering model with anomaly detection first separates the data into normal and abnormal events, where abnormal events are those that do not belong to known types. Normal events are then tagged with appropriate event types, and abnormal events are clustered. Finally, a cluster keyword extraction method recommends type names for the new event clusters, thus incrementally discovering new event types. The proposed method is effective in the incremental discovery of new event types on real data sets.
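The final stage, recommending type names from cluster keywords, can be illustrated with a small sketch. The abstract does not specify the keyword extraction method, so this example swaps in a cluster-level TF-IDF score as a stand-in; the function name `cluster_keywords` and the whitespace tokenizer are assumptions.

```python
import math
from collections import Counter


def cluster_keywords(clusters, top_k=1):
    """Suggest keywords per cluster using a cluster-level TF-IDF score.

    `clusters` maps a cluster id to a list of event texts.
    Returns a dict mapping each cluster id to its top_k keywords.
    """
    # Term frequency within each cluster (naive lowercase whitespace tokens).
    tf = {
        cid: Counter(tok for text in texts for tok in text.lower().split())
        for cid, texts in clusters.items()
    }
    # Number of clusters in which each term appears.
    df = Counter(tok for counts in tf.values() for tok in counts)
    n = len(clusters)
    result = {}
    for cid, counts in tf.items():
        scored = [(c * math.log(n / df[tok]), tok) for tok, c in counts.items()]
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[cid] = [tok for _, tok in scored[:top_k]]
    return result
```

Terms that occur in every cluster score zero, so generic words such as "bank" do not end up as type names.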

data mining, event type, machine learning, (18 more...)

2302.08205

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Jilin Province > Changchun (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Hierarchical Capsule Prediction Network for Marketing Campaigns Effect

Chu, Zhixuan, Ding, Hui, Zeng, Guang, Huang, Yuchen, Yan, Tan, Kang, Yulin, Li, Sheng

arXiv.org Artificial Intelligence, Aug-22-2022

Marketing campaigns are a set of strategic activities that can promote a business's goal. The effect prediction for marketing campaigns in a real industrial scenario is very complex and challenging due to the fact that prior knowledge is often learned from observation data, without any intervention for the marketing campaign. Furthermore, each subject is always under the interference of several marketing campaigns simultaneously. Therefore, we cannot easily parse and evaluate the effect of a single marketing campaign. To the best of our knowledge, there are currently no effective methodologies to solve such a problem, i.e., modeling an individual-level prediction task based on a hierarchical structure with multiple intertwined events. In this paper, we provide an in-depth analysis of the underlying parse tree-like structure involved in the effect prediction task and we further establish a Hierarchical Capsule Prediction Network (HapNet) for predicting the effects of marketing campaigns. Extensive results based on both the synthetic data and real data demonstrate the superiority of our model over the state-of-the-art methods and show remarkable practicability in real industrial applications.

capsule, event cluster, marketing campaign, (12 more...)

doi: 10.1145/3511808.3557099

2208.10113

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Marketing (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Empirical Study on Detecting Controversy in Social Media

Nematzadeh, Azadeh, Bang, Grace, Liu, Xiaomo, Ma, Zhiqiang

arXiv.org Machine Learning, Aug-25-2019

Companies and financial investors are paying increasing attention to social consciousness in developing their corporate strategies and making investment decisions to support a sustainable economy for the future. Public discussion of incidents and events involving companies (controversies) can provide valuable insights into how well a company operates with regard to social consciousness and can indicate the company's overall operational capability. However, evaluating the degree of a company's social consciousness and environmental sustainability is challenging due to the lack of systematic data. We introduce a system that uses Twitter data to detect and monitor controversial events and show their impact on market volatility. In our study, controversial events are identified from clusters of tweets that share the same 5W terms and from the sentiment polarities of these clusters. Credible news links inside the event tweets are used to validate the truth of the event. A case study on the Starbucks Philadelphia arrests shows that this method can provide the desired functionality.
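A rough sketch of the clustering idea follows: tweets that share the same set of 5W terms are grouped, and a group is flagged when it is large enough and its average sentiment is clearly negative. The thresholds, the input schema (`five_w`, `sentiment` keys), and the function name `find_controversies` are assumptions; the abstract does not state how polarity is turned into a decision.

```python
from collections import defaultdict


def find_controversies(tweets, min_size=3, max_mean_sentiment=-0.3):
    """Flag clusters of tweets that share 5W terms and skew negative.

    Each tweet is a dict with hypothetical keys:
      "five_w": iterable of who/what/when/where/why terms
      "sentiment": polarity score in [-1, 1]
    Returns a list of (sorted 5W terms, cluster size, mean sentiment).
    """
    clusters = defaultdict(list)
    for tweet in tweets:
        # Tweets with an identical set of 5W terms fall into the same cluster.
        clusters[frozenset(tweet["five_w"])].append(tweet["sentiment"])
    flagged = []
    for terms, sentiments in clusters.items():
        mean = sum(sentiments) / len(sentiments)
        if len(sentiments) >= min_size and mean <= max_mean_sentiment:
            flagged.append((sorted(terms), len(sentiments), mean))
    return flagged
```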

artificial intelligence, natural language, tweet, (14 more...)

1909.01093

Country:

North America > United States > Indiana (0.14)
North America > United States > Alaska (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Banking & Finance > Trading (1.00)
Information Technology (0.90)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Beyond Trending Topics: Real-World Event Identification on Twitter

Becker, Hila (Columbia University) | Naaman, Mor (Rutgers University) | Gravano, Luis (Columbia University)

AAAI Conferences, Jul-12-2011

User-contributed messages on social media sites such as Twitter have emerged as powerful, real-time means of information sharing on the Web. These short messages tend to reflect a variety of events in real time, making Twitter particularly well suited as a source of real-time event content. In this paper, we explore approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages. Our approach relies on a rich family of aggregate statistics of topically similar message clusters. Large-scale experiments over millions of Twitter messages show the effectiveness of our approach for surfacing real-world event content on Twitter.
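As an illustration of what aggregate cluster statistics might look like, the sketch below computes a few simple features for one cluster of messages. The specific features and the input schema (`user`, `text` keys) are assumptions picked for the example; the abstract does not list the statistics the authors use, and a classifier trained on such features would be a separate step.

```python
def cluster_features(messages):
    """Compute simple aggregate statistics for one message cluster.

    Each message is a dict with hypothetical keys "user" and "text".
    """
    n = len(messages)
    if n == 0:
        return {"size": 0, "author_diversity": 0.0,
                "retweet_fraction": 0.0, "hashtag_fraction": 0.0}
    return {
        "size": n,
        # Share of distinct authors: low values suggest one account repeating itself.
        "author_diversity": len({m["user"] for m in messages}) / n,
        # Share of messages that are retweets (old-style "RT " prefix).
        "retweet_fraction": sum(m["text"].startswith("RT ") for m in messages) / n,
        # Share of messages containing at least one hashtag.
        "hashtag_fraction": sum("#" in m["text"] for m in messages) / n,
    }
```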

artificial intelligence, classifier, machine learning, (18 more...)

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Iran (0.04)
Asia > China (0.04)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
